Abstract
This work examines … In 1975, computer scientist Jacques Vallée and his colleague Claude Poher presented a conference paper on the results of applying computational statistics to a database of UFO observations. They …
They investigate:
Their data set consisted of a set of sightings from France dating to X, and a set of sightings from outside of France.
This work attempts to replicate Vallee and Poher’s findings using a data set of sightings from the continental United States in the twenty-first century. Most of the original figures can be replicated. Weather data was retrieved from…
Weather conditions at the time of observation are the first question the authors tackle. It is reasonable to assume that if observers really are seeing something, we should see more sightings when the sky is clear than when it is not.
Vallee and Poher use an astronomer’s measure, atmospheric transparency, as a way to quantify conditions. When clouds, haze, fog, etc., are present in the atmosphere, atmospheric transparency decreases, making it harder to see objects, especially from farther away. The telescope company Celestron provides a short explanation of atmospheric transparency.
While there are resources for amateur astronomers that forecast atmospheric transparency over a wide area, the data is not easily accessible, especially going back decades and across many geographies. Instead, I will focus on visibility, in the same sense as often given by standard weather reports:
the greatest distance at which a black object of suitable dimensions, situated near the ground, can be seen and recognized when observed against a bright background. (International Civil Aviation Organization, via Wikipedia)
In the figure below, the left plot is from Vallee and Poher, showing the relationship between the number of sightings and atmospheric transparency. My own, on the right, plots the same but for visibility.
(Note that I have capped all visibility greater than 10 as 10.)
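The capping step can be sketched as follows. This is a minimal illustration, not the code used for the figure: the function names, the whole-unit binning, and the sample readings are all assumptions made for the example.

```python
from collections import Counter

VISIBILITY_CAP = 10  # same cap as in the text; units follow the weather source


def cap_visibility(values, cap=VISIBILITY_CAP):
    """Replace every visibility reading above `cap` with `cap`."""
    return [min(v, cap) for v in values]


def sightings_per_visibility(values, cap=VISIBILITY_CAP):
    """Count sightings in whole-unit visibility bins after capping."""
    capped = cap_visibility(values, cap)
    return Counter(int(v) for v in capped)


# Hypothetical readings, one per sighting (not real data).
readings = [10.0, 9.3, 15.0, 2.5, 10.0, 0.8, 12.0]
counts = sightings_per_visibility(readings)
for visibility in sorted(counts):
    print(visibility, counts[visibility])
```

Because every reading above the cap lands in the top bin, that bin will look taller than it would with uncapped data, which is worth keeping in mind when reading the right-hand plot.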
The two plots support approximately the same conclusion: Sightings are much more likely in very good conditions than in poor atmospheric conditions.
However, Vallee and Poher’s curve is not so sharp as mine, and I am unsure what the ‘theoretical curve’ refers to. The authors remark only that the empirical curve is consistent with ‘the model of the human vision for equidistributed luminous objects in the atmosphere.’
The authors also evaluate number of reports as a function of the angular elevation of the UAP relative to the observer. Unfortunately, this variable was not clearly broken out in my own data.
In this section, the authors examine the duration of sightings, and the observer’s reported distance from the object. Together with visibility, Vallee and Poher are able to compare the characteristics of the average UAP with the characteristics of known objects like balloons, planes, etc. They determine that UAP do not match the characteristics of more recognizable phenomena. A reasonable person may conclude that this rules out manmade objects, meteors, etc., for many UAP reports.
Observer distance from the object is not easily available in the gathered data set, so I will focus on duration of sighting only.
The NUFORC dataset has about 63 thousand cases that contain a clear description of the duration. An attempt was made to standardize the durations (‘5 minutes’, ‘30-40 minutes’, ‘an hour’) into seconds using regular expressions. Like the original plot, the distribution is log-transformed, but the x-axis has been back-transformed to the linear scale.
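One way such a standardization could look is sketched below. The pattern, the unit table, and the choice to reduce a range to its midpoint are assumptions of this sketch; the actual expressions used on the NUFORC data are not reproduced here.

```python
import re

# Seconds per unit; keys are matched as prefixes ("min" covers "mins", "minutes").
UNIT_SECONDS = {"sec": 1, "min": 60, "hour": 3600, "hr": 3600}

# Words that stand in for the number one ("an hour", "a minute").
WORD_NUMBERS = {"a": 1.0, "an": 1.0, "one": 1.0}

DURATION_RE = re.compile(
    r"""
    (?P<low>\d+(?:\.\d+)?|\ban\b|\ba\b|\bone\b)   # first number, or a/an/one
    (?:\s*(?:-|to)\s*(?P<high>\d+(?:\.\d+)?))?     # optional upper end of a range
    \s*
    (?P<unit>sec|min|hour|hr)                      # unit prefix
    """,
    re.IGNORECASE | re.VERBOSE,
)


def duration_to_seconds(text):
    """Return the duration in seconds, or None if the text cannot be parsed.

    A range such as '30-40 minutes' is reduced to its midpoint, which is an
    assumption of this sketch.
    """
    match = DURATION_RE.search(text)
    if match is None:
        return None
    low_text = match.group("low").lower()
    low = WORD_NUMBERS.get(low_text)
    if low is None:
        low = float(low_text)
    high = match.group("high")
    value = (low + float(high)) / 2 if high else low
    return value * UNIT_SECONDS[match.group("unit").lower()]


for raw in ["5 minutes", "30-40 minutes", "an hour", "unknown"]:
    print(repr(raw), "->", duration_to_seconds(raw))
```

Free-text fields are messy, so returning `None` for anything unparseable and counting how many rows fall through is probably safer than guessing.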
It is clear there is good agreement between the unbroken black line on the left and my plot on the right. Both distributions are approximately normal, although mine has a ‘hump’ on the left, indicating more observations in the ~10 second range than expected assuming normality. The median of the NUFORC data is 180 seconds, which appears to be just slightly higher than the median of Vallee and Poher’s plot, somewhere just north of 100 seconds. (Although it is hard to expect exactness in an old-fashioned plot like this!)
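The log-scale binning and the median can be sketched roughly as below. The bin width, the function name, and the sample durations are assumptions for illustration only; they are not the NUFORC values.

```python
import math
import statistics


def log10_histogram(durations_s, bins_per_decade=2):
    """Count durations in bins of equal width on a log10 scale.

    Returns a dict mapping each bin's lower edge, back-transformed to
    seconds, to the number of durations falling in that bin.
    """
    counts = {}
    for d in durations_s:
        if d <= 0:
            continue  # log10 is undefined for non-positive durations
        index = math.floor(math.log10(d) * bins_per_decade)
        lower_edge_s = 10 ** (index / bins_per_decade)
        counts[lower_edge_s] = counts.get(lower_edge_s, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical durations in seconds (not real data).
sample = [5, 10, 120, 300, 3600]
histogram = log10_histogram(sample, bins_per_decade=1)
median_s = statistics.median(sample)
print(histogram)
print("median:", median_s)
```

Binning in log space while labelling the edges in seconds gives the same effect as the plot: equal-width bars on a logarithmic axis with tick labels a reader can interpret directly.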
Vallee and Poher have included a second, broken line on their plot. The 350 cases actually refer to identified objects (despite the typo in the legend). There is thus an immense difference between unidentified and identified cases. The identified cases tend to be observed for less than 10 seconds, or greater than 1000 seconds, but only rarely in between. In contrast, UAP are recorded as having a duration in between.
This difference, along with those in distance and visibility, leads the authors to conclude that UAP cannot broadly be dismissed as normal sky objects (meteor, jet, etc.).
The distribution of sightings by time of day is reportedly very stable. The authors’ plot is below, to the left, using several data sets over several times. My own is to the right. All the densities are approximately the same, peaking around 9 p.m.
I examined the NUFORC dataset, sampling one year in every five. Clearly there is some variation from year to year. Through 2015, there was a trend toward sightings being more concentrated at 9 p.m. and less dispersed over the other hours. Before and after that period, sightings are spread over a wider range of hours.
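A rough sketch of this sampling and tabulation is given below. The function name, the starting year, and the sample timestamps are hypothetical; it only illustrates the idea of keeping one year in five and computing the share of sightings per hour.

```python
from collections import Counter, defaultdict
from datetime import datetime


def hourly_share_by_year(timestamps, start_year, step=5):
    """For every `step`-th year from `start_year`, return the share of
    sightings falling in each hour of the day (0-23)."""
    counts = defaultdict(Counter)
    for ts in timestamps:
        if ts.year >= start_year and (ts.year - start_year) % step == 0:
            counts[ts.year][ts.hour] += 1
    shares = {}
    for year, by_hour in counts.items():
        total = sum(by_hour.values())
        shares[year] = {hour: by_hour[hour] / total for hour in range(24)}
    return shares


# Hypothetical sighting times (not real data).
sightings = [
    datetime(2005, 7, 4, 21, 15),
    datetime(2005, 7, 5, 21, 40),
    datetime(2005, 8, 1, 3, 0),
    datetime(2007, 1, 1, 21, 0),   # skipped: 2007 is not a sampled year
    datetime(2010, 6, 1, 22, 30),
]
shares = hourly_share_by_year(sightings, start_year=2005)
for year in sorted(shares):
    peak_hour = max(shares[year], key=shares[year].get)
    print(year, "peak hour:", peak_hour)
```

Working with shares rather than raw counts keeps years with very different report volumes comparable on the same axis.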
This can be further explored by comparing the distribution to the percentage of people at home at any given hour. Vallee and Poher use this information to ‘reconstruc[t]… the time distribution of type-1 events taking into account the number of potential witnesses’:
All factors being held constant, witnesses are only in a position to observe one in fourteen close approaches of the earth. In order to generate the 2000 close-encounter observations we have in our files, the phenomenon would have had to manifest itself close to the ground 28,000 times during the time interval and in the regions considered here.
Although this inference is interesting, I don’t attempt to duplicate it here. However, it is still of some interest to update their Figure 8, if perhaps only sociologically.
The U.S. Bureau of Labor Statistics collects data on how Americans spend their time, collated into the American Time Use Survey. I use their table A-3A. It is aggregated to provide approximately the "percentage of the working population not at home", i.e., the percentage of the population that is both i) not sleeping, and ii) not at work or school. Note that this time use data is from 2015–2019, so I only use those years of the sighting database as well:
Notice that the original publication represents this data with two \(y\) axes. This is frowned upon now, but must have seemed like an excellent way to either reinforce the relationship, or else to save ink and space. Instead, I have opted to plot both side-by-side. We observe approximately the same relationship, although it is nowhere near as clean as the original. The correlation coefficient between the two percentages is 0.26.
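For readers who want to check a coefficient like this on their own hourly series, a minimal sketch follows. It assumes the Pearson coefficient, which the text does not specify, and the two short series are made-up numbers used only to exercise the function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical hourly percentages (not the ATUS or NUFORC values).
pct_available = [10.0, 20.0, 35.0, 30.0]
pct_sightings = [2.0, 3.0, 9.0, 4.0]
r = pearson_r(pct_available, pct_sightings)
print(round(r, 2))
```

With one value per hour of the day, each series would have only 24 points, so a coefficient of this size should be read as a loose association rather than a precise estimate.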